List of AI News about OpenAI hallucination rate
| Time | Details |
|---|---|
| 2026-01-08 11:23 | **PersonQA Benchmark Reveals Increasing Hallucination Rates in OpenAI Models: o1 vs o3 vs o4-mini** According to God of Prompt (@godofprompt), recent results from the PersonQA benchmark show a concerning trend in OpenAI's large language models. The hallucination rate rose with each newer model: OpenAI o1 exhibited a 16% hallucination rate, o3 rose to 33%, and o4-mini reached 48%, roughly double and triple the o1 figure respectively. These findings suggest that newer versions are not addressing, and may even be amplifying, the problem of factual inaccuracy in AI-generated content. The trend poses a challenge for enterprise AI adoption, since more frequent hallucinations can undermine trust, limit business applications in sensitive domains, and raise regulatory concerns. Companies deploying OpenAI models should evaluate model performance on domain-specific benchmarks and demand transparency about model updates to mitigate these risks. (Source: God of Prompt @godofprompt, Jan 8, 2026) |